Matthew Crook's profile

Translating Midjourney to Stable Diffusion II

I wanted to see how image generation compares between Midjourney and Stable Diffusion. Midjourney has a showcase of the top submitted images. (It updates daily, so the images I started with are no longer displayed.) I borrowed the prompts from 20 images and submitted them, unedited, to Stable Diffusion 2.1. I also downloaded the images, ran them through CLIP Interrogator 2.4, and submitted the resulting prompts (with minor editing) to Stable Diffusion 2.1, too. Finally, I compared CFG scales of 7.5 and 15 for each prompt. (For more on how CFG scale affects the final image, see here.)
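For readers who want the gist of CFG scale without following the link: in the usual classifier-free guidance formulation, the sampler makes two noise predictions at each step, one with the prompt and one without, and the scale controls how far the result is pushed toward the prompted one. Below is a minimal sketch with toy numbers. The helper name `apply_cfg` is hypothetical, and real implementations work on large tensors rather than short lists.

```python
def apply_cfg(uncond, cond, scale):
    """Combine unconditional and prompt-conditioned noise predictions.

    Standard classifier-free guidance: start from the unconditional
    prediction and move `scale` times the distance toward the conditional one.
    """
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Toy two-element "noise predictions" (made-up numbers, for illustration only).
uncond = [0.0, 1.0]
cond = [1.0, 3.0]

low = apply_cfg(uncond, cond, 7.5)    # the CFG value used for the first row
high = apply_cfg(uncond, cond, 15.0)  # the CFG value used for the second row
print(low)
print(high)
```

At a scale of 1 the result is just the prompted prediction. Going from 7.5 to 15 doubles the push away from the unprompted prediction, which is roughly why higher values tend to follow the prompt more literally.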

Here are the results for the final ten. In each case, the first row of images is with CFG = 7.5 and the second row of images is with CFG = 15. Since I can't share the original images (they are no longer online and I don't have the original creators' permission to share them), I'll just tell you how well Stable Diffusion did.
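The setup described above amounts to a small grid per image: two prompt sources by two CFG scales. Here is a rough sketch of how that grid could be enumerated. The function and field names are hypothetical, and the image generation step itself is deliberately left out.

```python
from itertools import product

CFG_SCALES = (7.5, 15.0)  # first row and second row of each result set
PROMPT_SOURCES = ("original", "generated")


def build_runs(number, original_prompt, generated_prompt):
    """Return one settings dict per (prompt source, CFG scale) combination."""
    prompts = {"original": original_prompt, "generated": generated_prompt}
    return [
        {"image": number, "source": source, "prompt": prompts[source], "cfg_scale": cfg}
        for source, cfg in product(PROMPT_SOURCES, CFG_SCALES)
    ]


runs = build_runs(
    20,
    "minimalist topography, cartography, Mars's Cerberus spot, textured paper",
    "there is a red and white painting with a white circle on it, ...",  # shortened here
)
for run in runs:
    print(run["image"], run["source"], run["cfg_scale"])
```

Each image therefore gets four sets of settings: the original prompt at both CFG scales, then the generated prompt at both.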

XI
Original Prompt 11: topography map of many winding rivers like ribbons thrown across the earth, intricate, dark color palette


Generated Prompt 11: there is a very colorful abstract painting with orange and blue, redshift Houdini, Peter Tarka, fractal Baroque, 3D 8K, inspired by James Jean, beautiful insanely detailed, inspired by Ludovit Fulla, disconnected shapes, paper texture, 1968, by Emma Andijewska, stylized layered textures

Conclusion: The original image showed swirls of color with bubbles of different colors floating above them. The swirls almost looked like quilling. This time Stable Diffusion produced images with more vibrant colors than the original, but the overall designs were less interesting. Stable Diffusion came closer to matching the original image with the generated prompt than it did with the original prompt.

XII
Original Prompt 12: spiral galaxy is the center of the universe, the gold galaxy systems circulating around the gilded galaxy, no letters, relief in gold, abstract, minimalism


Generated Prompt 12: painting of a spiral of fire and water with a black background, tooth wu, Quixel megascans, wlop, coherent painting, gold striated swirling finish, distant twinkling stars, art, psionic, strong blue and orange colors, photorealistic, Annato Finnstark

Conclusion: The original image was a galaxy that resembled the etching on a cymbal but also looked like an image from NASA. The original also had colors that were more vibrant than anything Stable Diffusion produced, even with the CFG scale set to 15. In this case the original prompt outperformed the generated prompt, since the images from the generated prompt look rather cartoonish.

XIII
Original Prompt 13: Unnerving Hyperdetailed Horror Sketch art, In the style of Junji Ito and Naoto Hattori, Beksinski, Surrealism, artistic darkness


Generated Prompt 13: there are two blue monsters with big eyes sitting behind a log,legless, armless, Laurie Greasley and James Jean, Beksinkski, ffffound, gif, the artist is Charles Burns, magic eye, hi-fructose art magazine, John Kenn Mortensen, structure, Kyle Lambert, Masahiro Ito

Negative Prompt 13: legs, arms, hands, feet, necks

Conclusion: The original image showed blue, one-eyed, fuzzy monsters (with no arms or legs) in a dark forest, drawn in a gritty cartoon style with muted colors. Stable Diffusion produced roughly similar colors, but the forest is not nearly as dark. Also, I was unable to consistently get armless, legless monsters despite using a negative prompt. Again, Stable Diffusion came closer to matching the original image with the generated prompt than it did with the original prompt.

XIV
Original Prompt 14: cel-shaded animation, Dark Fantasy tower with a bridge over a dry canyon river gorge, predawn gloaming, outlined minimalism, style of Cuno Amiet, Ferdinand Hodler, Harriet Lee-Merrion, strong bold lines


Generated Prompt 14: there is a poster of a bridge over a river in a canyon, epic portrait illustration, train, full color illustration, Firewatch, tonalism illustration, in style of Stanislav Vovchuk, Washington, Tom Richmond illustration, concept illustration, high detail illustration, no gradients, 1128x191 resolution, a beautiful artwork illustration

XV
Original Prompt 15: swirling brushstrokes and vibrant, contrasting colors such as electric blue, fiery orange, and vivid purple. The image evokes the mesmerizing and ever-changing patterns of a kaleidoscope


Generated Prompt 15: painting of a colorful swirl with a black background, fractal thunder Dan Mumford, inspired by Cyril Rolando, sacral chakra, 8K resolution, oil on canvas, style of Duelyst, inspired by Amanda Sage, painting of, style of Alex Grey, stained, loss of inner self, Olga Buzova, an abstract

Conclusion: The original image showed a colorful swirl that resembled a pinwheel, composed of broad, colorful brushstrokes. Stable Diffusion produced roughly similar colors when the CFG scale was set to 15, but the lines are much thinner. Again, Stable Diffusion came closer to matching the original image with the generated prompt than it did with the original prompt.

XVI
Original Prompt 16: a beach scene, in the style of bold chromaticity, exotic, spectacular backdrops, colorful arrangements, romantic gestures, joyful celebration of nature, sunset amazing light, with copy big space, zoom out, zoom


Generated Prompt 16: sunset on the beach with palm trees and waves, low detailed, digital painting, Pixiv, 8K, cliffside, James Edmiston, rich bright sunny colors, redpink sunset, far view, Hawaii, long view, detailed image, kiss, vector artwork, the colors of the sunset, on the ocean

Conclusion: The original image was a digital painting of a beach at sunset, with soft colors. Stable Diffusion produced roughly similar colors when the CFG scale was set to 15, but objects have outlines, which they did not in the original image. That said, this was another success (mostly), since the generated prompt produced images that were highly similar to the original. Again, Stable Diffusion came closer to matching the original image with the generated prompt than it did with the original prompt, since the original prompt returned photorealistic results.

XVII
Original Prompt 17: Camp Gouge and the woods map print by Wilkie Parker, in the style of Milton Glaser, made of crystals, the Helsinki school, cardboard, light black and rainbow, celebration of rural life, Contax T2


Generated Prompt 17: painting of a colorful landscape with a river and a village, cute pictoplasma, boreal forest, poster shot, Montana, with everything in its place, Olympics, sticker of a home in the forest, cut, game board, camp, arm, full of joy, risographic, ELS, K

Conclusion: The original image was a cutesy, almost childish painting of a town in the mountains with a river flowing past it. Stable Diffusion produced colors that were as bright as the original, but used a narrower range of them. Again, Stable Diffusion came closer to matching the original image with the generated prompt than it did with the original prompt, since the original prompt returned mostly maps of the USA.

XVIII
Original Prompt 18: illustration of a boy in the grass on a house with a window, in tall green grass , blue sky, with a gentle breeze blowing through it, in the style of Elsa Beskow, dark green and orange, aerial view, calming, duckcore


Generated Prompt 18: painting of a house in a field with a bird flying in the sky, girl walking between dunes, by Don Arday, in a large grassy green field, by François Quesnel, looking at the ocean, boys, high resolution details, fargo, waiting, Celtics, very very well detailed image, Wayne, painted in high resolution, Russ Abbott, summer street near a beach

Conclusion: The original image showed a little boy standing next to a bucolic cottage by the sea, painted in an impressionistic style. Stable Diffusion produced colors that were a bit brighter than the original, but utterly failed to match the painting style. Both the original prompt and the generated prompt deviated from the original image, but in different ways.

XIX
Original Prompt 19: A planet in the sky, the universe, highly detailed clouds. Fantasy, atmosphere fantasy sky, highly detailed clouds, two figures, (boy and girl) back, grass, fireflies


Generated Prompt 19: anime scene of two people standing on a hill looking at a planet, 4K vertical wallpaper, amazing sky, meteor, art cover, beautiful masterpiece, absolutely outstanding image, cinematic, amazing photo, anime, orange planet, beautiful photo, falling star on the background, beautiful composition, photo of a beautiful

Conclusion: The original image showed a boy and a girl holding hands and looking up at a sky dominated by a giant planet or moon, in a typical Midjourney style. Stable Diffusion produced colors that were pretty similar to the original, but the image overall was less stylized. Again, Stable Diffusion came closer to matching the original image with the generated prompt than it did with the original prompt.

XX
Original Prompt 20: minimalist topography, cartography, Mars's Cerberus spot, textured paper


Generated Prompt 20: there is a red and white painting with a white circle on it, Martian sands background, painted texture maps, war-torn, orange tone, cartographic, illustration, blood, Terragen, 64x64, cracked walls, dots abstract, heavy vignette, genji, textless, wild west background, earthy

Conclusion: The original image was an abstract painting of part of a red circle on an off-white background with cracks and dark red splotches. With the generated prompt, Stable Diffusion produced colors that were pretty similar to the original, but the style was slightly different. Again, Stable Diffusion came closer to matching the original image with the generated prompt than it did with the original prompt, which produced cut paper or folded paper rather than textured paper.

Final Thoughts
Conclusion: For the most part, the original Midjourney prompts produced very different images when submitted to Stable Diffusion. Using CLIP Interrogator to generate prompts resulted in images that were much closer to the Midjourney images in terms of subject and, to a lesser extent, color scheme. But CLIP Interrogator + Stable Diffusion 2.1 almost always failed to recreate the artistic style (and sometimes medium) of the original image from Midjourney.


Prompts were generated using CLIP Interrogator 2.4.
Illustrations were drawn using Stable Diffusion 2.1.